For this final quiz, you will work with data from the World Health Organization. You will need two data files from Canvas (WHOLifeExpectancy.xlsx and WHO_metadata.csv). You will import, clean, and restructure the data and then use it to create an interactive choropleth map.

  1. Import the life expectancy dataset into R. Give each variable an appropriate name.

    • You may find it helpful to use fairly long names here for most of the variables. For example, the first column of numbers in the data represents healthy life expectancy at birth for both sexes combined in 2016. I named this column HALEbirth_Both sexes_2016.
    • Note: I used two underscores and one space in the name, and I did that intentionally. Including spaces in variable names is usually bad practice, but these variable names will be eliminated in exercise 2, and this choice is helpful for exercise 2.
    • For each variable name, I patched together either “HALEbirth” or “HALE60” (for healthy life expectancy at age 60) along with “Both sexes”, “Male”, or “Female” and then the year. This structure should help with exercise 2…
    • Your dataset should have 183 observations and 31 variables.
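
One way to avoid typing 30 long names by hand is to generate them. The sketch below is only an illustration: 2016 is the only year confirmed above, so the other four years are placeholders, and the ordering (sex varying fastest, then measure, then year) is an assumption that must be checked against the actual column order in the spreadsheet.

```r
# Placeholder years: only 2016 is confirmed by the quiz text.
years    <- c(2016, 2015, 2010, 2005, 2000)
measures <- c("HALEbirth", "HALE60")
sexes    <- c("Both sexes", "Male", "Female")

# expand.grid() varies its first argument fastest, so the first row is
# Both sexes / HALEbirth / 2016.
grid <- expand.grid(sex = sexes, measure = measures, year = years,
                    stringsAsFactors = FALSE)

# Pattern: measure_sex_year (two underscores, one space inside "Both sexes").
col_names <- c("country",
               paste(grid$measure, grid$sex, grid$year, sep = "_"))

length(col_names)  # 1 country column + 30 measurement columns
col_names[2]

# The names could then be supplied at import time, for example:
# readxl::read_excel("WHOLifeExpectancy.xlsx", skip = n_header_rows,
#                    col_names = col_names)
# where n_header_rows is however many header rows the file has (not stated here).
```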
  2. Restructure the data so that it has only five variables, named country, sex, year, HALEbirth, and HALE60. The first few rows should look something like what is shown below. There should be 2745 rows; please print the first few rows of the data in your knitted document so that I can see them easily.

    ## # A tibble: 10 x 5
    ##    country     sex        year  HALEbirth HALE60
    ##    <chr>       <chr>      <chr>     <dbl>  <dbl>
    ##  1 Afghanistan Both sexes 2016       53     11.3
    ##  2 Afghanistan Male       2016       52.1   10.9
    ##  3 Afghanistan Female     2016       54.1   11.7
    ##  4 Albania     Both sexes 2016       68.1   16.3
    ##  5 Albania     Male       2016       66.7   15.3
    ##  6 Albania     Female     2016       69.6   17.4
    ##  7 Algeria     Both sexes 2016       65.5   15.8
    ##  8 Algeria     Male       2016       65.4   15.7
    ##  9 Algeria     Female     2016       65.6   15.8
    ## 10 Angola      Both sexes 2016       55.8   13.6
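
The naming pattern from exercise 1 is what makes this reshaping possible in a single step. The following is a minimal sketch on a toy table (two countries, two sexes, one year, with values copied from the output above), assuming the tidyr package; it is one possible approach rather than the required one. The special name ".value" tells pivot_longer() that the first piece of each column name should become a column of its own.

```r
library(tidyr)
library(tibble)

# Toy stand-in for the imported data; the real table has 31 columns.
wide <- tibble(
  country                     = c("Afghanistan", "Albania"),
  `HALEbirth_Both sexes_2016` = c(53, 68.1),
  `HALEbirth_Male_2016`       = c(52.1, 66.7),
  `HALE60_Both sexes_2016`    = c(11.3, 16.3),
  `HALE60_Male_2016`          = c(10.9, 15.3)
)

# Split each name at the underscores: the first piece names the output
# column (".value"), the second becomes sex, the third becomes year.
long <- pivot_longer(
  wide,
  cols      = -country,
  names_to  = c(".value", "sex", "year"),
  names_sep = "_"
)

long
```

Because the only separators are the two underscores, the space in "Both sexes" survives intact, and year comes out as a character column, as in the output above.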
  3. Create a simple feature (sf) object which contains the geometries for creating a world map along with the 2016 life expectancy data for both sexes combined.

    • To keep the size of the datasets down, filter the life expectancy data to include only 2016 life expectancy data for both sexes combined before joining it with the map data.
    • As is commonly the case, trying to use the full names of the countries to join the life expectancy data with the map data results in a large number of mismatches.
    • ISO codes tend to be more effective for joining datasets about countries, but the life expectancy data does not include ISO codes. The WHO website does provide a separate file (available in Canvas as “WHO_metadata.csv”) which contains both the names of the countries and the ISO codes. Since it comes from the same source, this metadata file uses the same country names as the life expectancy data.
    • Import the metadata file into R. First join the life expectancy data with the metadata (using the names of the countries), then join this dataset with the map data (using the ISO codes).
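
The two-step join can be sketched as follows, assuming dplyr. Everything here is a toy stand-in: the metadata column names (who_name, iso3) and the map column names (iso_a3, name) are hypothetical, so substitute whatever the real files use. The toy map is also a plain tibble; in the actual exercise it would be an sf object, and keeping it on the left side of the join is what preserves its geometry column and sf class.

```r
library(dplyr)
library(tibble)

# Filtered life expectancy data (values copied from the exercise 2 output).
life2016 <- tibble(
  country   = c("Afghanistan", "Albania", "Algeria"),
  HALEbirth = c(53, 68.1, 65.5)
)

# Hypothetical metadata layout: WHO country name plus ISO alpha-3 code.
metadata <- tibble(
  who_name = c("Afghanistan", "Albania", "Algeria"),
  iso3     = c("AFG", "ALB", "DZA")
)

# Toy stand-in for the map table (an sf object in the real exercise).
world <- tibble(
  iso_a3 = c("AFG", "ALB", "DZA"),
  name   = c("Afghanistan", "Albania", "Algeria")
)

# Step 1: attach ISO codes using the shared WHO country names.
life_iso <- left_join(life2016, metadata, by = c("country" = "who_name"))

# Step 2: attach the life expectancy data to the map using ISO codes.
map_data <- left_join(world, life_iso, by = c("iso_a3" = "iso3"))

# Rows of the life expectancy data that found no match in the map.
unmatched <- anti_join(life_iso, world, by = c("iso3" = "iso_a3"))
nrow(unmatched)
```

Checking the anti_join() result after each step is a quick way to see how many countries failed to match.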
  4. Create a choropleth map of the world with countries shaded by the healthy life expectancy at birth in 2016 for both sexes combined.

  5. Convert your map from question 4 into an interactive map using ggplotly(). Display the name of the country and the healthy life expectancy at birth when the user hovers over a country. Your final map might look something like this:
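
As a rough sketch of the hover setup for exercises 4 and 5 (not a reproduction of the map itself), the code below assumes the sf, ggplot2, and plotly packages and uses two toy squares in place of real country polygons. One common pattern is to build the hover label in a `text` aesthetic and then ask ggplotly() to show only that aesthetic; ggplot2 will warn that `text` is an unknown aesthetic, which is expected here.

```r
library(sf)
library(ggplot2)
library(plotly)

# Helper for the toy example only: a unit square with lower-left corner (x0, 0).
square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Toy sf object: two "countries" with values from the exercise 2 output.
toy_map <- st_sf(
  country   = c("Afghanistan", "Albania"),
  HALEbirth = c(53, 68.1),
  geometry  = st_sfc(square(0), square(2))
)

# Choropleth: fill by life expectancy; `text` carries the hover label.
p <- ggplot(toy_map) +
  geom_sf(aes(fill = HALEbirth,
              text = paste0(country, "\nHALE at birth: ", HALEbirth))) +
  scale_fill_viridis_c(name = "HALE at birth (years)")

# Convert to an interactive plot that shows only the custom label on hover.
interactive_map <- ggplotly(p, tooltip = "text")

# In a knitted document, evaluating `interactive_map` displays the widget.
```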